Fault Injection Verification of IBM POWER6 Soft Error Resilience
Authors
Abstract
Full-chip statistical fault injection (SFI) has been performed on a hardware-emulated POWER6 platform. The SFI results were validated against proton beam injection results. The fault isolation, error recovery, and error logging capabilities of the POWER6 enabled measurement of the architectural derating factors. The proportions of errors, as well as the derating factors, matched well between the simulated and particle beam injection cases. The simulated and particle-beam-induced fault injection methods are compared and contrasted. SFI and particle beam fault injection complement each other to provide an overall understanding of the microarchitecture's error resilience.
Similar resources
DECAF-FSEFI: A Fine-grained, Accountable, Flexible, and Efficient Soft Error Fault Injection Framework for Profiling Application Vulnerability
Resilient computation has been an emerging topic in the field of high-performance computing (HPC) for several years. In particular, studies show that tolerating faults on leadership-class supercomputers (such as exascale supercomputers) is expected to be one of the main challenges. In this paper, we utilize dynamic binary instrumentation and virtual machine based fault injection to emulate soft ...
Generating Control Logic for Optimized Soft Error Resilience
Aggressive technology scaling has necessitated the development of techniques to ensure resilience to device faults, including soft errors, circuit wear-out, variability, and environmental effects. All error resilience techniques employ some form of redundancy, resulting in added cost such as area or power overhead. Existing selective hardening techniques have been focused on identifying on the ...
Soft Error Resilience of Probabilistic Inference Applications
With shrinking device size and increasing complexity, soft errors are becoming an issue in the reliability of digital systems. To make efficient robust systems, it is important to understand how soft errors affect the quality of output for the target applications. Probabilistic inference applications are interesting since they produce non-exact results and yet are useful in many different field...
Exception handling analysis and transformation using fault injection: Study of resilience against unanticipated exceptions
Context: In software, there are the error cases that are anticipated at specification and design time, those encountered at development and testing time, and those that were never anticipated before happening in production. Is it possible to learn from the anticipated errors during design to analyze and improve the resilience against the unanticipated ones in production? Objective: In this pape...
High Performance Dense Linear System Solver with Resilience to Multiple Soft Errors
In the multi-petaflop era for supercomputers, the number of computing cores is growing exponentially. However, as integrated circuit technology scales below 65 nm, the critical charge required to flip a gate or a memory cell has been dangerously reduced, leading to a higher cosmic-radiation-induced soft error rate. Soft errors threaten computing systems by producing silent data corruption which i...